ARTICLE



Custom oligonucleotide array-based CGH: a reliable 
diagnostic tool for detection of exonic copy-number 
changes in multiple targeted genes 

Aurelie Vasson 1, Celine Leroux 1, Lucie Orhant 1, Mathieu Boimard 1, Aurelie Toussaint 1, Chrystel Leroy 1,
Virginie Commere 1, Tiffany Ghiotti 1, Nathalie Deburgrave 1, Yoann Saillour 2, Isabelle Atlan 1,
Corinne Fouveaut 1, Cherif Beldjord 1,2, Sophie Valleix 1,2, France Leturcq 1,2, Catherine Dode 1,2,
Thierry Bienvenu 1,2, Jamel Chelly 1,2 and Mireille Cossee*,1,2

The frequency of disease-related large rearrangements (referred to as copy-number mutations, CNMs) varies among genes, and
the search for these mutations has an important place in diagnostic strategies. In recent years, CGH using custom-designed
high-density oligonucleotide arrays has become a powerful tool for detecting alterations at the level of exons, with the
added flexibility of user-defined chip designs. The aim of our study was to test a custom-designed oligonucleotide CGH
array in a diagnostic laboratory that analyses several genes involved in various genetic diseases, and to compare it with
conventional strategies. To this end, we designed a 12-plex CGH array (135k; 135 000 probes/subarray) (Roche Nimblegen)
with exonic and intronic oligonucleotide probes covering 26 genes routinely analyzed in the laboratory. We tested control
samples with known CNMs and patients for whom the genetic causes underlying their disorders were unknown. The technique
proved reproducible, reliable and sensitive enough to detect heterozygous single-exon deletions or duplications, complex
rearrangements and somatic mosaicism. In addition, it improves the reliability of CNM detection and allows boundaries to be
determined precisely enough to direct targeted sequencing of breakpoints. These features, together with the simultaneous
analysis of several genes and the scalability of a 'homemade' design, make it a valuable new diagnostic approach for CNMs.
INTRODUCTION

Large intragenic deletions and duplications are frequent causes of
genetic disorders. The term copy-number mutations (CNMs) is
used here, from the standpoint of pathological relevance, to distinguish
them from the neutral term 'copy-number variation' (CNV), in accordance with
previous studies. 1 In the diagnostic context, the search for CNMs
is therefore a major objective, particularly when they are frequent, as
in Duchenne or Becker muscular dystrophies (DMD/BMD), in which they
represent 60-70% of mutations. 2,3 Even in diseases for which
CNMs are rare, searching for them is important for diagnostic confirmation
when screening for point mutations is negative or inconclusive.
For example, in cystic fibrosis (CF), CNMs represent 1.5% of known
CFTR mutations and are searched for in patients heterozygous for
point mutations in CFTR. 4

Several scanning methods for CNMs, including Southern blotting,
multiplex amplifiable probe hybridization (MAPH), quantitative
multiplex PCR of short fluorescent fragments (QMPSF) and multiplex
ligation-dependent probe amplification (MLPA), are currently used, 5-7
but they have several disadvantages. The preparation of high-quality
Southern blots is technically demanding and time consuming, and the
sensitivity of the method is generally low and limited to very large CNMs. 8 It has
been shown that the QMPSF approach used in several studies is
inherently biased in favor of the detection of deletions over
duplications, suggesting that a change of copy number from two to
one (in the case of heterozygous deletions) is more readily identifiable
than a change from two to three (in the case of heterozygous
duplications). 4,9 The MLPA approach seems more reliable, as it is based
on hybridization of probes to genomic regions of interest. However,
this technique is available as commercial kits in which the
number of probes is usually limited to one per exon. For diseases with
rare CNMs, commercial kits such as MLPA are not available, and
scanning relies on 'home-made' techniques
such as semi-quantitative fluorescent PCR (QF-PCR) and real-time
quantitative PCR (Q-PCR), which are time consuming to
implement. As PCR-based techniques and MLPA are restricted
to a limited number of targeted sequences, they can fail to detect
some rearrangements; conversely, false-positive single-exon losses
can result from single nucleotide polymorphisms (SNPs) affecting
primer or probe sequences. 10 Finally, these scanning methods do not
allow determination of CNM boundaries, and only a limited number of
genes can be analyzed concomitantly. In general, each gene requires
its own kit or 'home-made' technique for CNM detection.

The recent emergence of array comparative genomic hybridization
has revolutionized the ability to identify CNMs associated with
various diseases. This approach was first used to detect large
CNMs at the scale of multiple contiguous genes in whole-genome
analysis. 11 In recent years, the development of oligonucleotide probes
for hybridization on microarrays has made it possible to explore smaller CNMs at
the scale of exons. Single-locus arrays were first validated to detect
exonic and intronic CNMs within the DMD locus and the CFTR
locus in patients suspected of having mutations in these genes. 1,8,10,12
Thereafter, several teams (including ours) implemented multiple-gene
CGH arrays. 13-15 Custom oligonucleotide CGH arrays then emerged as
a powerful tool for high-resolution detection of genomic CNMs, with
the flexibility provided by customized array designs. The recent
increase in the density of probes that can be loaded on chips has led to
the emergence of high-density (HD) arrays, thereby increasing
the number of genes that can be tested. 16

In our Laboratory of Molecular Biology, Cochin Hospital (Paris),
we perform molecular diagnosis of several genetic diseases with
various clinical presentations, mutational spectra and modes of inheritance.
Some of them are known to be caused, in different proportions,
by CNMs (Table 1). 1,17-33 Conventional methods used to search for
CNMs are mainly based on QF-PCR and MLPA. In order to replace
the time-consuming current scanning methods with a single technique
for CNM detection, we first developed a 72k four-plex array covering
the 158 exons of eight disease-related genes: DMD, the sarcoglycan genes
and CFTR. 13 Exonic copy-number changes were identified with high
resolution: abnormalities of about 1.5-2 kb could be detected, as
could a mosaic deletion. 13 We then wanted to extend this
approach to all the genes tested in our laboratory that are
known or suspected to be prone to CNMs (26 genes). We took
advantage of the advent of HD chips to develop a custom-targeted
12-plex 135K oligonucleotide-based CGH array covering the 344
exons of these genes. We report the implementation and validation of this
array and its application in a diagnostic setting to test different
diseases with great clinical and genetic heterogeneity.

PATIENTS AND METHODS 

Patients and control samples 

The patients were referred to the Laboratory of Molecular Biology, Cochin 
Hospital, for molecular analysis of the genes of interest. Reference DNAs were 
obtained from control patients with a well characterized deletion of exons 7 
and 8 of the CFTR gene. 13 Genomic DNAs were extracted from leukocytes 
using standard procedures (phenol extraction or Wizard Genomic DNA 
Isolation System, Promega, Madison, WI, USA). 

CGH array 

Microarray construction. With support from Roche NimbleGen, we devised 12-plex
oligonucleotide-based CGH arrays to explore all 26 genes, including
promoters. The design took into consideration the size and
organization of each gene (Table 2). Using data from the human genome
(http://genome.ucsc.edu/cgi-bin/hgGateway, NCBI36/hg18), 135 000 oligonucleotide
probes covering all genomic regions of the 26 genes, plus 2 kb at each extremity,
were designed for each subarray. On the basis of experimental results,
different average tiling intervals (ie, spacing between 5' ends of probes) were
selected (Table 2). 'Backbone' probes covering the chromosomes corresponding to
the genes of interest were added at a lower density (spacing of 20-25 kb) in
intergenic regions. All probes have similar characteristics: isothermal probes
with a melting temperature (Tm) of 76 °C and an average probe length of 60 bases. To
avoid cross-hybridization, all probes were compared with the entire hg18
genome using the Basic Local Alignment Search Tool (BLAST). Any probe that did not
map uniquely was removed, except those in the pseudoautosomal regions of
chromosomes X and Y, for which two locations were tolerated. Roche
NimbleGen manufactured the array (www.nimblegen.com). Sequences of the
135 000 probes are available on request.

Fluorescent DNA labeling and microarray hybridization. DNA concentration and
quality were evaluated by NanoDrop and agarose gel migration. The reference
DNA used for each patient's DNA was extracted using the same technique, and
each sample was sex-matched with its reference DNA. Each DNA sample
(1 µg) was labeled using a NimbleGen Dual-Color DNA Labeling Kit as
described previously 13 according to the manufacturer's protocol (Roche
NimbleGen). After denaturation, hybridization was carried out on a
NimbleGen Hybridization System for 40 h at 42 °C. The array was then
washed using the NimbleGen Wash System (Roche NimbleGen), dried by
centrifugation and scanned at 2 µm resolution using an InnoScan 900 scanner
(Innopsys, Toulouse, France).

Data analysis. Raw fluorescence intensity data were obtained from scanned
images of the oligonucleotide arrays using NimbleScan 2.6 extraction
software (Roche NimbleGen). For each spot on the array, the log2 ratio of the
Cy3-labeled test sample versus the Cy5-labeled reference sample was calculated, and
the results were visualized using SignalMap software (Roche
NimbleGen). The quality of the experiment was assessed by the mad1.dr value
(median absolute deviation of the log2 ratio difference between consecutive
probes), which provides a surrogate measure of experimental noise and should be
<0.23.
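
As an illustration of the quality metric described above, the following minimal Python sketch computes a median absolute deviation of the differences between consecutive log2 ratios and compares it with the 0.23 threshold. It is an assumption-laden sketch, not the NimbleScan implementation: the function names are hypothetical, the probes are assumed to be already sorted by genomic position, and no scaling constant is applied, which may differ from what the vendor software does.

```python
from statistics import median


def mad1dr(log2_ratios):
    """Median absolute deviation of differences between consecutive probes.

    `log2_ratios` is assumed to be ordered by genomic position.
    Hypothetical helper; the vendor software may scale the value differently.
    """
    if len(log2_ratios) < 3:
        raise ValueError("need at least three probes")
    # First differences between neighbouring probes.
    diffs = [b - a for a, b in zip(log2_ratios, log2_ratios[1:])]
    center = median(diffs)
    # Median of the absolute deviations from the median difference.
    return median(abs(d - center) for d in diffs)


def passes_qc(log2_ratios, threshold=0.23):
    """True when the noise estimate is below the threshold quoted in the text."""
    return mad1dr(log2_ratios) < threshold


if __name__ == "__main__":
    quiet = [0.0, 0.1, -0.1, 0.05, 0.0]
    noisy = [0.0, 1.0, -1.0, 1.0, -1.0, 1.0]
    print(round(mad1dr(quiet), 3), passes_qc(quiet))
    print(round(mad1dr(noisy), 3), passes_qc(noisy))
```

Because the metric is built on differences between neighbouring probes, a genuine copy-number step contributes only one large difference and barely moves the median, so the value mostly reflects probe-to-probe noise.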

DNA sequencing 

CGH array results were used to target the genomic region to design primers for 
sequencing breakpoints. Primers were first selected within a 500-bp interval 
and further in a 1-kb interval if necessary. Oligonucleotide primer pairs were 
designed with the help of the Primer3Plus online tool
(http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi). Sequences of primers are
available on request. Bidirectional sequencing of the purified PCR products 
was performed on an Applied (3130XT) automated capillary sequencer 
(Applied Biosystems, Foster City, CA, USA) (protocols available on request). 
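
The two-step primer selection described above (a 500-bp interval first, then a 1-kb interval if necessary) can be sketched as follows. This is only one plausible reading of the procedure: the function name, the 1-based coordinate convention and the choice to place the forward window upstream of the proximal boundary and the reverse window downstream of the distal boundary are all assumptions, not details given in the text.

```python
def primer_search_windows(proximal, distal, widths=(500, 1000)):
    """Return candidate primer windows flanking a CGH-estimated deletion.

    `proximal` and `distal` are the estimated first and last deleted
    positions (1-based, same chromosome). Hypothetical helper: windows are
    tried in order, the wider one only if the first gives no product.
    """
    if proximal >= distal:
        raise ValueError("proximal boundary must precede distal boundary")
    windows = []
    for width in widths:
        # Forward primer is searched just upstream of the proximal boundary.
        forward = (max(1, proximal - width), proximal - 1)
        # Reverse primer is searched just downstream of the distal boundary.
        reverse = (distal + 1, distal + width)
        windows.append({"width": width, "forward": forward, "reverse": reverse})
    return windows


if __name__ == "__main__":
    for window in primer_search_windows(10_000, 12_000):
        print(window)
```

The returned intervals would then be passed to a primer design tool; that step is not shown here.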

RESULTS 

Development of a custom 12-plex CGH array for CNM detection in 
26 genes 

In order to define the CGH array design that allows analysis of the
greatest number of genes within the constraint of the available
number of probes, we performed preliminary experiments and
compared four different designs. Design 1: exons and introns were
homogeneously covered by evenly distributed probes, with a tiling space
of 24 bp. Design 2: only exons were covered by probes, and each exon
was covered by 30-40 probes. Design 3: exons and flanking intronic
sequences were covered by 30-40 probes, and introns were covered by
probes with a spacing of 600 bp. Design 4: exons and flanking intronic
sequences were covered by 30-40 probes, and introns were covered by
probes with a spacing of 100 bp. These experiments showed that
design 2 was not suitable because of deviations from the expected
log2 ratio baseline affecting several consecutive probes.
Designs 3 and 4, which require far fewer probes than
design 1, were appropriate for efficient detection of exonic CNMs. These
experiments also indicated that including a set of
backbone probes covering intergenic regions would help
to overcome baseline irregularity.

We selected 26 genes routinely analyzed in our laboratory that are involved
in heterogeneous groups of disorders and are prone, or suspected to be
prone, to CNMs (Tables 1 and 2). Genomic size and organization are
highly variable between these genes, from the small tubulin genes, with
their four exons spanning about 4 kb of genomic DNA, to the DMD
gene, which spans 2.2 Mb and is composed of 79 exons (Table 2). We
Table 1 Diseases selected and indications for CGH array analysis

Disease | Inheritance | Gene | Reported frequency of CNMs | Reference | Indications for CGH analysis

Myopathies
Dystrophinopathies | XR | DMD | 70% | Leiden muscular dystrophy pages (http://www.dmd.nl/) 2,3 | Diagnosis: first molecular screening (all index cases). Determination of boundaries
Sarcoglycanopathies | AR | SGCA (alpha-SG) | Rare (few cases) | Leiden muscular dystrophy pages (http://www.dmd.nl/) 17 | Patients with only one point mutation identified, or abnormal immunolabeling and no point mutation detected
 | AR | SGCB (beta-SG) | Very rare (few cases) | Leiden muscular dystrophy pages (http://www.dmd.nl/) | Patients with only one point mutation identified, or abnormal immunolabeling and no point mutation detected
 | AR | SGCG (gamma-SG) | Rare (few cases) | Leiden muscular dystrophy pages (http://www.dmd.nl/) 16,17 | Patients with only one point mutation identified, or abnormal immunolabeling and no point mutation detected
 | AR | SGCD (delta-SG) | No reported case | - | Patients with only one point mutation identified, or abnormal immunolabeling and no point mutation detected
Emery Dreifuss syndrome | XR | EMD | Few cases | Leiden muscular dystrophy pages (http://www.dmd.nl/) 18 | Typical clinical syndrome, no point mutation in the EMD gene nor in LAMA2 gene

Mental retardation
Rett syndrome (RTT); neonatal encephalopathy in males | XD | MECP2 | 5% of females with RTT; 2% of males with severe encephalopathy; large CNMs | - | RTT females without point mutation in the MECP2 gene. First molecular screening in males with severe encephalopathy
Rett variant with early epilepsy | XD | CDKL5 | Rare, >20 cases | - | Atypical RTT females without point mutation in the MECP2 and CDKL5 genes
RTT variant with congenital form | AD, de novo | FOXG1 | Rare, <20 cases | - | Congenital variant of Rett syndrome without point mutations in the MECP2 and FOXG1 genes
Rett-like syndrome | AD | Netrin G1 | 1 case of translocation | - | Typical and atypical RTT patients without mutations in the MECP2, CDKL5 and FOXG1 genes
Rett-like syndrome | AD | JNK3 | 1 case of translocation | - | Typical and atypical RTT patients without mutations in the MECP2, CDKL5 and FOXG1 genes
Fragile X syndrome | XD | FMR1 | Rare (deletions) | - | Patients without expansion and with highly evocative phenotype
 | XR | FMR2 | Rare (deletions) | - | Patients without expansion and with highly evocative phenotype
Mental retardation due to ARX | XR | ARX | Rare (deletions) | - | Patients with no point mutation and with evocative phenotype
Lissencephalies and other cortical brain malformations | XD | DCX | Deletions and duplications described (large CNMs) | - | Patients with no point mutation and with evocative phenotype. Determination of boundaries
 | XR | OPHN1 | Rare | - | Patients with no point mutation and with evocative phenotype
 | AD | LIS1 (Pafah1b1) | 60% (deletions) (large CNMs) | - | Patients with no point mutation and with evocative phenotype. Determination of boundaries
 | AD | TUBA1A | - | - | Patients with no point mutation and with evocative phenotype
 | AD | TUBB2B | - | - | Patients with no point mutation and with evocative phenotype
 | AD | TUBB3 | - | - | Patients with no point mutation and with evocative phenotype
 | AD | TUBB6 | - | - | Patients with no point mutation and with evocative phenotype
 | AD | TUBB5 | - | - | Patients with no point mutation and with evocative phenotype

Other diseases
Cystic fibrosis or CFTR-related disorder | AR | CFTR | 2.5-5% | Cystic fibrosis mutation database (www.genet.sickkids.on.ca/cftr/) 1,4 | Patients with cystic fibrosis or CFTR-related disorder heterozygous for a point mutation
Kallmann syndrome | - | KAL1 | 10% (large CNMs) | 32,33 | Males without point mutations in the 5 KAL genes. Determination of boundaries
Hemophilia A | XR | F8 | 5-10% | 19 | Cases without recurrent intron 22 and intron 1 F8 inversions and without point mutations
Table 2 Human disease genes selected and design of the custom CGH array

Gene | Chromosome | Cytogenetic band | Gene location (UCSC hg18, ref. seq gene) | Ref. seq | Gene size (kb) | Exons (N) | OMIM | Design (average tiling)
DMD | X | Xp21.2 | 31 047 266-33 267 647 (Dp427c) | NM_000109 | 2220 | 79 | 300377 | 50 bp
SGCA (alpha-SG) | 17 | 17q21 | 45 598 365-45 608 292 | NM_000023.1 | 10 | 10 | 600119 | 15 bp
SGCB (beta-SG) | 4 | 4q12 | 52 581 618-52 599 242 | NM_000232 | 17.6 | 6 | 600900 | 15 bp
SGCG (gamma-SG) | 13 | 13q12 | 22 653 060-22 797 304 | NM_000231 | 144.2 | 8 | 608896 | 40 bp exons, 300 bp introns
SGCD (delta-SG) | 5 | 5q33-5q34 | 155 686 345-156 127 376 | NM_000337 | 441 | 9 | 601411 | 40 bp
EMD | X | Xq28 | 153 260 791-153 263 077 | NM_000117 | 2.3 | 6 | 300384 | 15 bp
MECP2 | X | Xq28 | 152 940 458-153 016 382 | NM_004992 | 76 | 4 | 300005 | 15 bp
CDKL5 | X | Xp22.13 | 18 353 646-18 581 670 | NM_003159 | 228 | 21 | 300203 | 40 bp exons, 300 bp introns
FOXG1 | 14 | 14q12 | 28 306 038-28 308 622 | NM_005249.3 | 2.6 | 1 | 164874 | 15 bp
NTNG1 | 1 | 1p13.3 | 107 484 152-107 825 998 | NM_001113226.1 | 342 | 8 | 608818 | 40 bp exons, 300 bp introns
JNK3 (MAPK10) | 4 | 4q21.3 | 87 155 300-87 593 307 | NM_138980.2 | 436.7 | 14 | 602897 | 40 bp exons, 300 bp introns
FMR1 | X | Xq27.3 | 146 801 201-146 840 333 | NM_2024.4 | 39 | 17 | 300624 | 40 bp exons, 300 bp introns
FMR2 | X | Xq28 | 147 389 831-147 889 899 | NM_2025 | 500 | 22 | 300806 | 40 bp exons, 300 bp introns
ARX | X | Xq21.3 | 24 931 732-24 943 986 | NM_007492 | 12 | 5 | 300382 | 15 bp
DCX | X | Xq22.3q23 | 110 423 663-110 542 062 | NM_178152 | 118 | 8 | 300121 | 40 bp exons, 300 bp introns
OPHN1 | X | Xq12 | 67 178 911-67 570 024 | NM_002547 | 391 | 24 | 30127 | 40 bp exons, 300 bp introns
LIS1 (Pafah1b1) | 17 | 17p13.3 | 2 443 673-2 535 659 | NM_000430 | 92 | 10 | 601545 | 40 bp exons, 300 bp introns
TUBA1A | 12 | 12p13.12 | 47 864 850-47 869 128 | NM_006009 | 4 | 4 | 602529 | 15 bp
TUBB2B | 6 | 6p25 | 3 169 494-3 172 967 | NM_178012 | 4 | 4 | 612850 | 15 bp
TUBB3 | 16 | 16q24.3 | 88 517 188-88 530 006 | NM_006086 | 4 | 4 | 602661 | 15 bp
TUBB6 | 18 | 18p11.21 | [illegible in source] | NM_032525 | 4 | 4 | - | 15 bp
TUBB5 | 6 | 6 | - | NM_178014 | 4 | 4 | - | 15 bp
CFTR | 7 | 7q31.2 | 116 907 253-117 095 954 | NM_000492.3 | 189 | 27 | 602421 | 20 bp
KAL1 | X | Xp22.3 | 8 456 915-8 660 227 | NM_000216 | 203.3 | 14 | 308700 | 40 bp exons, 300 bp introns
F8 | X | Xq28 | 153 717 258-153 904 192 (isoform a precursor) | NM_000132 | 187 | 26 | 306700 | 40 bp exons, 300 bp introns
SHOX | X | Xp22.33 | 505 079-527 558 | NM_000451.3 | 22 | 5 | 312865 | 15 bp

50 bp: average tiling of 50 bp in exons and introns.
15 bp: average tiling of 15 bp in exons and introns.
20 bp: average tiling of 20 bp in exons and introns.
40 bp exons, 300 bp introns: average tiling of 40 bp in exons and 300 bp in introns.



developed a custom-targeted oligonucleotide-based CGH array containing
135 000 oligonucleotide probes covering the genomic regions of
these 26 genes, including their promoters and 2-kb upstream and
downstream regions. A total of 344 exons, 26 promoters and the
corresponding intronic regions were thus covered (Table 2). For each
gene, the design of the 60-mer probes was determined according to the
characteristics of the gene and the preliminary experiments, in order
to optimize visualization of CNMs (Table 2). For the 12 smaller genes
(4-9 kb), exons and introns were covered by a high density of probes (tiling of
15 bp). For large genes, we chose to cover introns with a lower probe
density (one probe every 300 bp) than exons (one probe every 40 bp) to
save space on the array, except for the CFTR and DMD genes.
For these two genes, intronic coverage was as dense as exonic coverage
(tiling of 20 bp for CFTR and 50 bp for DMD) to improve
determination of breakpoints. Backbone probes were added at a
lower density in intergenic sequences to obtain a stable baseline.
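
To make the trade-off between tiling density and array capacity concrete, the short Python sketch below estimates how many probes a region consumes for a given average tiling. It is a rough back-of-the-envelope model under stated assumptions: one probe start per tiling interval, no probes lost to the uniqueness filter, and the function names and the exon/intron lengths in the second example are hypothetical. The gene sizes, tiling values and 2-kb flanks are those quoted in the text and Table 2.

```python
import math


def probes_uniform(region_bp, tiling_bp, flank_bp=2000):
    """Probes needed to tile a region (plus a flank on each side) uniformly."""
    return math.ceil((region_bp + 2 * flank_bp) / tiling_bp)


def probes_split(exon_bp, intron_bp, exon_tiling=40, intron_tiling=300):
    """Probes needed when exons and introns use different tiling intervals."""
    return math.ceil(exon_bp / exon_tiling) + math.ceil(intron_bp / intron_tiling)


if __name__ == "__main__":
    dmd = probes_uniform(2_220_000, 50)   # DMD, 2220 kb at 50 bp
    cftr = probes_uniform(189_000, 20)    # CFTR, 189 kb at 20 bp
    print(dmd, cftr, dmd + cftr)
    # Hypothetical large gene: 3 kb of exons, 225 kb of introns.
    print(probes_split(3_000, 225_000))
```

Under these assumptions, DMD and CFTR together account for roughly 54 000 of the 135 000 probes per subarray, which illustrates why the remaining large genes were given sparser intronic coverage.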

Validation of the custom CGH array by analysis of control DNAs 

The performance of the custom CGH array was evaluated using 38
DNA samples from patients with known CNMs previously identified
by other techniques (Table 3). These ranged from CNMs of a
single exon to rearrangements of entire genes. Control patients
were hemizygous, homozygous or heterozygous, and three cases of somatic
mosaicism were included.

With strict quality criteria, in particular a mad1.dr
value below 0.23, no false-negative result was observed. All 38
Table 3 Exonic copy-number mutations (CNMs) in DMD, SGs, CFTR, CDKL5, DCX, LIS1, KAL and F8 genes used to validate the custom
CGH array

Gene | N° | Mutation type | Conventional name | Name | Conventional methods | Status | Size of the CNM determined by CGH array
DMD | 1 a | Duplication | DMDdup2 | c.32-?_93+?dup | QF-PCR, Q-PCR | He | 250 kb
 | 2 a | Duplication | DMDdup2_34 | c.32-?_4845+?dup | QF-PCR | He | 772 kb
 | 3 a | Duplication | DMDdup3_9 | c.94-?_960+?dup | QF-PCR | He | 310.8 kb
 | 4 a | Duplication | DMDdup17 | c.1993-?_2168+?dup | QF-PCR | He | 11.1 kb
 | 5 a | Duplication | DMDdup5_7 | c.265-?_649+?del | QF-PCR | He | 130 kb
 | 6 a | Deletion | DMDdele45 | c.6439-?_6614+?del | QF-PCR | He | 151 kb
 | 7 a | Deletion | DMDdele45 | c.6439-?_6614+?del | QF-PCR | He | 17.4 kb
 | 8 a | Duplication | DMDdup48_49 | c.6913-?_7200+?dup | QF-PCR | He | 57 kb
 | 9 a | Deletion | DMDdele51 | c.7310-?_7542+?del | QF-PCR | He | 49.8 kb
 | 10 a | Deletion (mosaic) | DMDdele49_52 | c.7099-?_7660+?del | QF-PCR, RT-PCR | He | 106 kb
 | 11 a | Duplication | DMDdup55 | c.8028-?_8217+?dup | QF-PCR | Ht | 56 kb
 | 12 a | Deletion | DMDdele46_47 | c.6615-?_6912+?del | QF-PCR | He | 23.9 kb
 | 13 | Deletion | DMDdele53 | c.7661-?_c.7872+?del | QF-PCR | He | 18 kb
 | 14 | Duplication | DMDdup7 | c.531-?_649+?dup | QF-PCR | He | 2.2 kb
 | 15 a | Deletion | DMDdele45_47 | c.6439-?_6912+?del | QF-PCR | Ht | 149.4 kb
 | 16 | Duplication (mosaic) | DMDdup61-62 | c.9085-?_9224+?dup | QF-PCR, Q-PCR | Ht | 2.3 kb
SGCG | 17 a | Deletion | SGdele5_6 | c.386-?_578+?del/c.386-?_578+?del | QF-PCR | Ho | 37.2 kb
 | 18 a | Deletion | SGdele3 | c.196-?_297+?del/c.196-?_297+?del | QF-PCR | Ho | 16.2 kb
 | 19 a | Deletion | SGdele7 | c.579-?_702+?del/c.579-?_702+?del | QF-PCR | Ho | 4.2 kb
 | 20 a | Deletion | SGdele7 | c.579-?_702+?del | QF-PCR | Ht | 4.2 kb
SGCA | 21 a | Deletion | ASGdele7_8 | c.748-?_983+?del | QF-PCR | Ht | 1.2 kb
CFTR | 22 a | Deletion | CFTRdele3_10,14b_16 | c.165-?_1584+?del;c.2620-?_2988+?del | MLPA | Ht | 84 kb
 | 23 a | Deletion | CFTRdele17a_17b | c.2989-977_3367+248del2515 | MLPA | Ht | 2.5 kb
 | 24 a | Deletion | CFTRdele17a_18 | c.2989-449_3468+644del5288 | MLPA | Ht | 5.2 kb
 | 25 a | Deletion | CFTRdele2_3 | c.54-?_273+?del | MLPA | Ht | 21 kb
 | 26 a | Deletion | CFTRdele22_23 | c.3964-78_4242+577del1532 | MLPA | Ht | 1.5 kb
CDKL5 | 27 | Deletion | CDKL5dele1 | c.1-?_345+?del | MLPA | Ht | 294.4 kb
DCX | 28 | Deletion (mosaic 40:60, mutant:WT) | DCXdele4 | c.706_5550_808+39del | QF-PCR, Q-PCR | He | 6.2 kb
 | 29 | Duplication | DCXdup4_7 | c.705+18032_backbone | QF-PCR | He | 81.6 kb
 | 30 | Duplication | DCXdele2 (Turner, mosaic 20%) | c.backbone_364+1054del | QF-PCR | Ht | 7.9 kb
LIS1 (Pafah1b1) | 31 | Deletion (Miller-Dieker deletion) | LIS1 entire gene | c.1-?_1233+?del | QF-PCR | Ht | 740 kb
 | 32 | Deletion | LIS1 entire gene deletion AFH1B1dele4_11 | c.1-?_1233+?del | QF-PCR | Ht | 65.2 kb
 | 33 | Deletion | LIS1 entire gene deletion | c.1-?_1233+?del | QF-PCR | Ht | 8.7 kb
 | 34 | Deletion | LIS1 entire gene deletion | c.1-?_1233+?del | QF-PCR | Ht | 51.8 kb
KAL1 | 35 | Deletion | KAL1 entire deletion | c.1-?_2043+?del | Southern blot | He | 2297 kb
 | 36 | | KAL1 entire deletion | c.1-?_2043+?del | Southern blot | He | 1600 kb
 | 37 | | KAL1dele3_13 | c.256-?_1984+?del | Southern blot | He | 91 kb
F8 | 38 | Deletion | F8dele2_6 | c.143-?_670+?del (new); c.86-?_613+?del (old) | PCR | He | 38.6 kb

Abbreviations: N°, number; He, hemizygous patients; Ht, heterozygous individuals; Ho, homozygous patients.

A total of 38 control DNAs were studied in a blind trial. Nomenclature corresponds to the approved nomenclature. For hemophilia there are two nomenclatures: one that takes into account the
signal peptide (the new nomenclature) and one that does not (the old nomenclature). Both are still used, to date, for diagnostic reports, and are indicated in
the table.

a Control samples tested with the previous CGH array design. 13
CNMs were accurately identified and precisely characterized, although
the rearrangements were of different sizes and scattered across different genomic
regions (Table 3). The smallest deletion was a heterozygous 1.5 kb
deletion of exons 22 and 23 within the CFTR gene, which spans 189 kb of
genomic DNA (patient 26). We tested several monoexonic CNMs in
the DMD gene, including a heterozygous duplication. All were
correctly identified, as was the frequent small CNV (1.4 kb) in
intron 2 of this gene (Figure 1a). We also analyzed one DNA sample
from chorionic villi with a deletion in the DMD gene, which was
correctly detected. A heterozygous deletion of exon 1 of the CDKL5
gene (patient 27), which was difficult to detect by MLPA
because it appeared as a dosage reduction of a single peak among 45
signals, was easily visualized using the CGH array, as it in fact extends over
a large distal genomic region of 294 kb (Figure 2).

To assess the threshold of CNM detection with this procedure, we
analyzed DNAs from three controls with somatic mosaicism. We
reanalyzed DNA from a BMD patient with mosaicism in the DMD
gene previously tested with the 4-plex design (patient 10, exons 49-52
deletion, 80:20 mutant:wild-type (WT)); 13 this mosaic deletion was
correctly detected (data not shown). We also analyzed a male patient
with lissencephaly who had somatic mosaicism for a deletion in the
DCX gene (patient 28; 40:60, mutant:WT). The deletion was detected
with a log2 ratio of -0.4 (data not shown). To refine our evaluation,
we then tested DNA from a female carrier with a heterozygous
duplication within the DMD gene in approximately 50% of cells
(sample 16). CGH array analysis clearly identified the mosaicism as a
gain of signal with a log2 ratio of +0.3, instead of the expected value
of 0.4 for a heterozygous female (Figure 3).
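
The relationship between the mosaic fraction and the shift in log2 ratio can be sketched with a simple dosage model: the test signal is proportional to the mean copy number across cells, and the ratio is taken against a sex-matched reference. The Python function below is an idealized illustration only; its name and parameters are hypothetical, and real arrays typically show compressed ratios, which would be consistent with the 0.4 quoted above for a non-mosaic heterozygous duplication being below the theoretical value.

```python
import math


def expected_log2_ratio(normal_copies, mutant_copies, mutant_fraction):
    """Idealized log2 ratio for a sample where a fraction of cells is mutant.

    The reference is assumed to carry `normal_copies` (sex-matched control).
    """
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("mutant_fraction must be between 0 and 1")
    # Mean copy number across mutant and wild-type cells.
    mean_copies = (mutant_fraction * mutant_copies
                   + (1.0 - mutant_fraction) * normal_copies)
    if mean_copies <= 0:
        raise ValueError("mean copy number must be positive")
    return math.log2(mean_copies / normal_copies)


if __name__ == "__main__":
    # Heterozygous duplication in a female, all cells (2 -> 3 copies).
    print(round(expected_log2_ratio(2, 3, 1.0), 3))
    # Same duplication present in about 50% of cells.
    print(round(expected_log2_ratio(2, 3, 0.5), 3))
    # Hemizygous deletion in a male, 40% mutant cells (1 -> 0 copies).
    print(round(expected_log2_ratio(1, 0, 0.4), 3))
```

In this model a 50% mosaic heterozygous duplication gives a value near 0.32, close to the +0.3 reported above, whereas the idealized value for the 40:60 mosaic deletion is more negative than the observed -0.4, so the model should be read as a guide to direction and relative magnitude rather than as a calibration.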



Finally, the relevance of including backbone probes was evaluated
through analysis of deletions encompassing entire genes and extending
outside the gene regions. We tested four DNA samples from
patients with lissencephaly due to heterozygous deletions of the
whole LIS1 gene (patients 31-34). The deletions were clearly
identified, but their boundaries were only roughly determined because of the low
density of backbone probes in intergenic regions. The largest deletion
was identified in a patient with Miller-Dieker syndrome, a contiguous
gene disorder due to a chromosome 17p13.3 deletion including
LIS1 (Figure 4a). We also analyzed two samples from patients with
Kallmann syndrome due to large deletions of the entire KAL1
gene (patients 35 and 36), including one with associated ocular
albinism (patient 35). The CGH array detected a large deletion of 2297 kb
encompassing not only the whole KAL1 gene but also the neighboring
genomic region including GPR143 (Figure 4b), which is known to be
associated, when mutated, with ocular albinism phenotypes. 34

Use of the custom CGH array for diagnosis and genetic counseling 

Following validation with control DNAs carrying CNMs in various genes,
we applied the array as a diagnostic tool for patients without a
molecular diagnosis and/or awaiting genetic counseling. The CNM
screening strategy varied according to the genetic characteristics of each
disease (Table 1).

We applied the custom CGH array as a first-line diagnostic method to
investigate the DMD gene in 262 DNA samples from patients (or
female relatives when DNA from the patient was not available) for whom
clinical, biological and histological data were compatible with the
diagnosis of dystrophinopathy. We identified 102 deletions (90
Figure 1 Detection by genomic CGH array of small CNMs and CNVs in different genes. (a) Detection of a hemizygous duplication of 2.2 kb encompassing
exon 7 of the DMD gene and a duplication of 1.4 kb in intron 2 (patient 14). (b) Detection of a heterozygous 1.4 kb deletion removing exons 25 and
26 of the CFTR gene. The horizontal axis shows the position along the genome (NCBI36; hg18) and the vertical axis the Cy3:Cy5 log2 ratios. Patient
sample was fluorescently labeled using Cy3 and control sample using Cy5. Control was sex matched with patient. The arrows indicate the location of the
copy-number change.

Figure 2 Heterozygous deletion of exon 1 of CDKL5 and neighboring genes (patient 27) is identified more reliably and accurately by CGH array compared
with MLPA. (a) MLPA showing a reduction of the peak corresponding to CDKL5 exon 1 among CDKL5 exonic probe and control probe peaks. (b) Detection
by genomic CGH array of a 294 kb deletion on chromosome X from intron 1 of the CDKL5 gene to the SCML2 and CXorf20 neighboring genes.



hemizygous and 12 heterozygous), 40 duplications (31 hemizygous 
and 9 heterozygous) and 2 complex rearrangements in the DMD 
gene. Single-exon CNMs were easily detected as they encompass in 
general part of flanking intronic regions. In contrast, rearrangements 
involving entire genes were visualized as a shift of the baseline 
corresponding to backbone probes in the intergenic regions. Inter- 
estingly, in a case of isolated duplication of exon 44 detected by cDNA 
analysis but not by QF-PCR, CGH array identified the duplication 
and showed that the proximal breakpoint was very close to exon 44. 
The primer used for QF-PCR was outside of the duplicated region, 
accounting for the false-negative result (data not shown). In two DMD 
families, we detected a complex rearrangement, that is, CNMs 
involving different genomic parts of the gene and/or with 
more than two breakpoint junctions. The first case combined 
a duplication of exons 61-62, a duplication of exons 65-67 and 
abnormal log2 ratio values for the exon 68-79 probes, suggesting a 
triplication. Exons 63-64 were normal. We confirmed by real-time 
Q-PCR that exon 63 was normal, exon 67 duplicated and exon 68 
triplicated (Figure 5A). CGH array analyses performed on samples 
from three female relatives of different generations showed that the 
rearrangement was stable and transmitted ad integrum. The other 
case of complex rearrangement in the DMD gene was identified in a 
heterozygous female and combined a duplication of exons 8-41, a 
duplication of exons 44-51 and a triplication of exons 42-43, 
confirmed by real-time Q-PCR (data not shown). Breakpoints 
characterized by CGH array were used to choose oligonucleotides 



for sequencing in 20 patients with various deletions in the DMD gene 
(article in preparation). On average, the real breakpoints lay 
190 bp from the predicted proximal breakpoint (SD = 242 bp) and 267 bp from 
the predicted distal breakpoint (SD = 275 bp). In most cases (n = 33 sequences), 
the difference was < 500 bp and sequencing was successful using the 
first set of primers chosen on the basis of CGH array boundaries (see 
Patients and methods section). 
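
To illustrate how array boundaries can direct primer placement, the following minimal Python sketch groups consecutive probes whose log2 ratio passes a threshold and reports, for each altered run, the two probe-to-probe intervals in which the breakpoints must lie. It is not the analysis software used in this study: the names `call_segments` and `Segment`, the 0.3 threshold and the probe coordinates in the demo are all assumptions made for the example.

```python
from itertools import groupby
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    kind: str                            # "gain" or "loss"
    proximal: Tuple[Optional[int], int]  # (probe just before the run, first altered probe)
    distal: Tuple[int, Optional[int]]    # (last altered probe, probe just after the run)


def call_segments(positions: List[int],
                  log2_ratios: List[float],
                  threshold: float = 0.3) -> List[Segment]:
    """Group consecutive altered probes and bracket each breakpoint.

    `positions` must be sorted genomic coordinates, one per probe.
    `threshold` is a hypothetical cut-off, not a value from the study.
    """
    if len(positions) != len(log2_ratios):
        raise ValueError("positions and log2_ratios must have the same length")

    def state(ratio: float) -> str:
        if ratio >= threshold:
            return "gain"
        if ratio <= -threshold:
            return "loss"
        return "normal"

    segments: List[Segment] = []
    index = 0
    for kind, run in groupby(state(r) for r in log2_ratios):
        length = len(list(run))
        first, last = index, index + length - 1
        if kind != "normal":
            before = positions[first - 1] if first > 0 else None
            after = positions[last + 1] if last + 1 < len(positions) else None
            segments.append(Segment(kind,
                                    (before, positions[first]),
                                    (positions[last], after)))
        index += length
    return segments


if __name__ == "__main__":
    # Invented probe coordinates and ratios, for demonstration only.
    demo_positions = [1000, 1200, 1400, 1600, 1800, 2000, 2200]
    demo_ratios = [0.02, -0.05, -0.9, -1.1, -0.95, 0.03, 0.0]
    for segment in call_segments(demo_positions, demo_ratios):
        print(segment)
```

The width of each reported interval is set by the local probe spacing, which is why a high probe density in introns narrows the region to be sequenced. For a deletion, a forward primer placed upstream of the proximal interval and a reverse primer placed downstream of the distal interval would flank the junction.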

We also used the custom CGH array as a second-line diagnostic 
method to test DNA samples from 83 patients (or relatives when the patient's DNA was not 
available) with phenotypic data highly suggestive of a disease and for 
whom only one point mutation or no point mutation was detected 
(18 Rett - typical or atypical - syndrome or neonatal encephalopathy, 
42 CF or CFTR-related disorder, 23 severe hemophilia A). We 
identified 16 CNMs involving MECP2 (n = 8), CDKL5 
(n = 2), CFTR (n = 3) or the F8 gene (n = 3). CNMs involving 
MECP2 were two heterozygous deletions in Rett females, five 
hemizygous large duplications including the entire MECP2 gene 
and a large complex duplication-triplication (Figure 5B) in males 
with neonatal encephalopathy. Two heterozygous large deletions of 
CDKL5 and neighboring genomic regions were identified in females 
with atypical Rett syndrome. In CF patients, we identified three 
heterozygous CNMs involving CFTR, including the smallest CNM 
identified in this study (deletion of exons 25 and 26, 1.4 kb) 
(Figure 1b). CNMs were identified in the F8 gene in three families 
with severe hemophilia A, two heterozygous CNMs (one deletion of 
two exons, one duplication of eight exons) in carrier females (affected 
Figure 3 Detection of a somatic mosaicism corresponding to a duplication of exons 61 and 62 of the DMD gene in a carrier female (patient 16). (a) Real-time 
quantitative PCR of exons 61 and 62 performed on genomic DNA from blood of patient 16 (P) suggested that 50% of cells are heterozygous for the 
exons 61 and 62 duplication. DupC designates a control female heterozygous in all cells for duplication of exons 61 and 62 of the DMD gene. N indicates 
a normal control female. The vertical axis shows the relative quantification of the tested DNA compared with the normal control allele in three different 
PCR experiments; the mean of the three quantifications is indicated above each peak. (b) Detection by genomic CGH array of the mosaic duplication. The 
arrow indicates the location of the duplication, which extends over 2.3 kb encompassing exons 61 and 62. The log2 ratio is at +0.3, in accordance with the 
quantification of 50% of heterozygous mutant:wild-type cells. 



males deceased) and a hemizygous deletion of five exons in a male 
patient. 
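
The hemizygous and heterozygous CNMs listed above differ in the log2 ratio they are expected to produce, because the ratio depends on the copy number in the sex-matched control. The short Python sketch below tabulates these idealised values; the function name `expected_log2_ratio` and the table labels are introduced here for illustration only, and measured ratios on a real array usually deviate from the theoretical figures.

```python
import math


def expected_log2_ratio(test_copies: int, reference_copies: int) -> float:
    """Idealised log2(test/reference) ratio for a given copy-number state."""
    if reference_copies <= 0:
        raise ValueError("reference_copies must be positive")
    if test_copies < 0:
        raise ValueError("test_copies cannot be negative")
    if test_copies == 0:
        # No target sequence at all: the theoretical ratio is minus infinity.
        # In practice a finite, strongly negative value is measured.
        return float("-inf")
    return math.log2(test_copies / reference_copies)


# (label, copies in patient, copies in sex-matched control)
STATES = [
    ("heterozygous deletion (2-copy region)", 1, 2),
    ("heterozygous duplication (2-copy region)", 3, 2),
    ("heterozygous triplication (2-copy region)", 4, 2),
    ("hemizygous deletion (X-linked, male)", 0, 1),
    ("hemizygous duplication (X-linked, male)", 2, 1),
    ("hemizygous triplication (X-linked, male)", 3, 1),
]

if __name__ == "__main__":
    for label, test, reference in STATES:
        print(f"{label}: {expected_log2_ratio(test, reference):+.3f}")
```

Under these assumptions a hemizygous duplication in a male gives the same theoretical ratio (+1.0) as a heterozygous triplication in a two-copy region, so the sex and zygosity of the sample have to be taken into account when a profile is interpreted.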

We also used CGH array for genetic counseling in the family of 
patient 36 affected by Kallmann syndrome. Analysis of his unaffected 
mother revealed that the large deletion (whole KAL1 gene and 
neighboring genomic region) was not present in leukocytes from the 
mother (Figure 4b). 

DISCUSSION 

CNM detection has improved considerably over the past 15 years. 
Initially, large rearrangements were characterized by Southern blot 
technology, which is a manual and time-consuming method. 
Fluorescent semi-quantitative PCR techniques and 
real-time quantitative PCR then appeared; they are more accurate but do not allow 
simultaneous analysis of several genes. MAPH and MLPA techniques 
changed the management of CNMs by allowing 
simultaneous analysis of various exons and genes responsible for 
similar diseases. Moreover, the emergence of commercial kits, 
providing better reproducibility, ensured the spread of the MLPA 
approach. However, commercial MLPA kits do not allow concomitant 
analysis of a large number of genes. The development of CGH array 
methodology constituted a real breakthrough. In this study, we 
describe the implementation, validation and benefits of a custom CGH 
array that analyses the 344 exons from 26 genes tested in our 
diagnostic laboratory. 



For the array design, we decided to use variable density of probes 
depending on gene characteristics. For genes with small genomic size, 
we opted for a HD of probes in exons and introns. For the DMD and 
the CFTR genes, which are frequently analyzed, we chose a HD of 
probes in exons and introns to facilitate identification of intronic 
breakpoints. For the other large genes, we decided to place a lower 
density of probes in introns to save space on the chip. Intergenic 
regions were covered by backbone probes to maintain baseline 
stability between genes. We found this design generally 
satisfactory in terms of sensitivity and specificity. By applying 
strict quality criteria, such as the mad1dr value, to validate each 
experiment, we did not have any false-negative results among the 38 
control DNA samples. In a second phase of the study, we tested 345 
cases not previously tested and identified 178 exonic CNMs in 
different genes (162 in DMD, 8 in MECP2, 2 in CDKL5, 3 in CFTR 
and 3 in the F8 gene). Single-exon CNMs were clearly identified and the 
sensitivity threshold was 1.4 kb (although no smaller rearrangement was 
available in this series). In some genes, such as KAL1, CNMs are difficult 
to detect by PCR-based techniques because of a homologous gene 
(KALP on chromosome Y in the case of KAL1). We correctly 
identified three deletions of the KAL1 gene, and detected one case 
of neomutation. Because the oligonucleotide CGH array technique is 
based on the hybridization of DNA to multiple probes spread all 
along the target sequence, it helps to avoid the false-negative and 
false-positive results obtained with PCR techniques and MLPA. 
Our custom CGH array was efficient in determining the size of the 
Figure 4 Detection of rearrangements involving whole genes and neighboring genomic regions. (a) Heterozygous deletion of 740 kb involving the entire LIS1 
(PAFAH1B1) gene and neighboring METT10D, KIAA0064 and GARNL4 genes identified in patient 31 with Miller-Dieker syndrome. The MNT and OR3A2 
genes are not deleted. (b) Hemizygous deletion of the entire KAL1 gene in patient 35 with Kallmann syndrome associated with ocular albinism, 
corresponding to a neomutation event. A 2297-kb deletion is detected in the patient sample and involves not only the KAL1 gene, but also the neighboring 
genomic region including the GPR143 (OA1) gene. Mutations in this gene are known to be associated with ocular albinism phenotypes. The deletion is not 
detected in the DNA sample from the mother. 



deleted/duplicated fragments. Accuracy of this design was sufficient to 
allow targeted sequencing of breakpoints of CNMs in the DMD gene, 
as the average distance between breakpoints indicated by CGH and 
those determined by sequencing is < 300 bp. Large CNMs involving 
not only the gene of interest, but also neighboring genes, are not rare 
and identification of the other genes involved can be important for 
phenotype-genotype correlations. In our custom CGH array, the presence 
of intergenic (backbone) probes made it possible to delineate large rearrangements 
extending outside of the gene region, as illustrated for MECP2, 
LIS1 and KAL1. In a patient of our series with Kallmann syndrome 
associated with ocular albinism, CGH array identified a large 
deletion encompassing not only the KAL1 gene but also the GPR143 
gene, which is known to be associated with ocular albinism 
phenotypes. 34 
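
The quality criterion mentioned above can be illustrated with a probe-to-probe noise estimate. The sketch below computes the median absolute deviation (MAD) of the differences between the log2 ratios of adjacent probes. This is offered only as an approximation in the spirit of that criterion: the exact definition and scaling used by the vendor software may differ, no acceptance cut-off is given in this section, and the function name `mad_first_differences` is an assumption of this example.

```python
import statistics
from typing import Sequence


def mad_first_differences(log2_ratios: Sequence[float]) -> float:
    """MAD of the differences between adjacent probes' log2 ratios.

    Probes are assumed to be ordered by genomic position.
    """
    differences = [b - a for a, b in zip(log2_ratios, log2_ratios[1:])]
    if not differences:
        raise ValueError("at least two probes are required")
    centre = statistics.median(differences)
    return statistics.median([abs(d - centre) for d in differences])


if __name__ == "__main__":
    # Invented profiles, for demonstration only.
    clean_with_deletion = [0.0, 0.0, 0.0, -1.0, -1.0, -1.0, 0.0, 0.0]
    noisy_without_cnm = [0.2, -0.3, 0.25, -0.2, 0.3, -0.25, 0.2, -0.3]
    print(mad_first_differences(clean_with_deletion))
    print(mad_first_differences(noisy_without_cnm))
```

Because a genuine copy-number change contributes only one large difference at each boundary, a median-based statistic of this kind reflects hybridization noise rather than the presence of a CNM, which is what makes it usable as an experiment-level quality filter.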

Custom oligonucleotide CGH array technology also appears 
particularly efficient at detecting and precisely characterizing 
complex rearrangements, which are rare but difficult to identify with 
current techniques including RT-PCR. In fact, we identified several 
cases of double duplications or complex rearrangements combining 
duplication and triplication of several exons in different genes (DMD, 
CFTR and MECP2), and demonstrated that their characterization is 
facilitated by simultaneous visualization of the entire rearrangement. 



For instance, we identified in the DMD gene a particularly complex 
rearrangement that combines duplications and triplications of several 
non-consecutive exons. To our knowledge, and pending 
contributions from large-scale sequencing, only the targeted CGH array 
approach makes it easy to define the extent and exon copy numbers of 
complex rearrangements, which is important for accurate familial 
studies (carrier determination and prenatal diagnosis) and for 
phenotype-genotype correlation studies. Somatic mosaicisms are 
even rarer, but their identification is important for genetic counseling. 
We tested a woman suspected of having a somatic mosaicism for a 
duplication of two exons in the DMD gene. The clear visualization of 
the entire duplication in this woman allowed us to confirm that she 
was a carrier of the mutation and to provide more appropriate 
genetic counseling. We also used CGH array for carrier status 
determination in some families without an available index case. 
This approach was particularly conclusive in our experience for 
dystrophinopathies, hemophilia A and Kallmann syndrome carriers, 
but can be applied to all diseases with large rearrangements. 
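
The agreement between the log2 ratio of +0.3 and the 50% mosaicism reported for the carrier female (Figure 3) can be checked with a simple mixing model. The Python sketch below assumes that a fraction f of cells carries one extra copy of the duplicated exons on one of two X chromosomes, so that the mean copy number is 2 + f. The function names and default parameters are introduced for this example and are not part of the original analysis.

```python
import math


def mosaic_log2_ratio(fraction: float,
                      normal_copies: int = 2,
                      copy_change: int = 1) -> float:
    """Expected log2 ratio when `fraction` of cells carry the CNM.

    copy_change is +1 for a heterozygous duplication, -1 for a
    heterozygous deletion.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    mean_copies = normal_copies + fraction * copy_change
    return math.log2(mean_copies / normal_copies)


def mosaic_fraction(log2_ratio: float,
                    normal_copies: int = 2,
                    copy_change: int = 1) -> float:
    """Invert the model: fraction of mutant cells for an observed ratio."""
    return normal_copies * (2.0 ** log2_ratio - 1.0) / copy_change


if __name__ == "__main__":
    print(f"{mosaic_log2_ratio(0.5):.3f}")  # 50% of cells duplicated
    print(f"{mosaic_log2_ratio(1.0):.3f}")  # constitutional duplication
    print(f"{mosaic_fraction(0.3):.2f}")    # fraction implied by +0.3
```

Under this model, 50% of heterozygous cells gives log2(2.5/2), about 0.32, against about 0.58 for a constitutional heterozygous duplication, and an observed ratio of +0.3 corresponds to roughly 46% of mutant cells. Both figures are consistent with the quantitative PCR estimate, within the precision that can be expected from either method.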

In addition to its efficiency and sensitivity, this custom CGH array 
is very powerful as it allows simultaneous analysis of a large number 
of exons (344 exons) corresponding to 26 disease genes. The 12-plex 
CGH array can detect CNMs, which are different in localization, type 
Figure 5 Identification of complex rearrangements. (A) Complex rearrangement in the DMD gene: duplication of exons 61-62, duplication of exons 
65-67, triplication of exons 68-79. (a) CGH array. (b) Confirmation by real-time PCR of exons 63, 67 and 68. (c) Genealogical data. Tested women are 
indicated by an asterisk. The complex rearrangement was stably transmitted. (B) Triplication of the MECP2 gene embedded within a duplication in a male 
patient with severe encephalopathy. (a) CGH array. The log2 ratios (indicated as log2 R) from 0.822 to 1.393 indicated a hemizygous triplication of an 82-kb 
region that includes the entire MECP2 gene, embedded within a duplicated region of 329 kb involving several other genes. (b) Confirmation of the 
triplication by real-time PCR of exons 3 and 4 of the MECP2 gene. Dup C, duplicated control; N, normal control; P, patient. 



and size in < 5 days for one experiment of 12 patients. Analyzing 
the 26 genes by MLPA, which is a multistep 
approach, would require eight different kits, with a working 
time of 2 days per kit and at least double the cost per sample. Moreover, 
some of the genes analyzed by our custom CGH array are not 
available in MLPA kits (in particular the EMD gene and tubulin 
genes), and in kits such as the one for XLMR genes, analyses are not 
exhaustive as probes are present for only some of the exons. 

In conclusion, custom oligonucleotide-based CGH array makes an 
undeniable contribution to diagnosis compared with conventional techniques 
by improving the reliability and accuracy of CNM detection. The 
possibility of simultaneous analysis of several genes and its scalability 
make it a valuable tool for a new diagnostic approach to CNMs and 
should facilitate the molecular diagnosis of heterogeneous groups of 
diseases such as muscular dystrophies 16 or mental retardation. It is the 
first technology that allows determination of CNM boundaries 
precisely enough to guide targeted sequencing of breakpoints. 35 
The possibility of large-scale sequencing of breakpoints will bring real 
progress in understanding the molecular mechanisms of rearrangements, 
searching for genotype-phenotype correlations and guiding certain 
therapeutic strategies. 